Using machine learning (ML) to develop quantitative structure-activity relationship (QSAR) models for contaminant reactivity has emerged as a promising approach because it can effectively handle non-linear relationships. However, ML is often data-demanding, whereas data scarcity is common in QSAR model development. Here, we proposed two approaches to address this issue: combining small datasets and transferring knowledge between them. First, we compiled four individual datasets for four oxidants, i.e., SO4•-, HClO, O3 and ClO2, each dataset containing a different number of contaminants with their corresponding rate constants and reaction conditions (pH and/or temperature). We then used molecular fingerprints (MF) or molecular descriptors (MD) to represent the contaminants; combined them with ML algorithms to develop individual QSAR models for these four datasets; and interpreted the models by the SHapley Additive exPlanations (SHAP) method. The results showed that both the optimal contaminant representation and the best ML algorithm are dataset-dependent. Next, we merged these four datasets and developed a unified model, which showed better predictive performance on the datasets of HClO, O3 and ClO2 because the model ‘corrected’ some wrongly learned effects of several atom groups. We further developed knowledge transfer models based on the second approach, the effectiveness of which depends on whether there is consistent knowledge shared between the two datasets as well as on the predictive performance of the respective single models. This study demonstrated the benefit of combining small similar datasets and transferring knowledge between them, which can be leveraged to boost the predictive performance of ML-assisted QSAR models.
-
The prognostic and therapeutic value of the tumor microenvironment (TME) in various cancer types is of major interest. Characterization of the TME often relies on a small representative tissue sample. However, the adequacy of such a sample for assessing components of the TME is not yet known. Here, we used immunohistochemical (IHC) staining and 7-color multiplex staining to evaluate CD8 (cluster of differentiation 8), CD68, PD-L1 (programmed death-ligand 1), CD34, FAP (fibroblast activation protein), and cytokeratin in 220 tissue cores from 26 high-grade serous ovarian cancer samples. Comparisons were drawn between a larger tumor specimen and smaller core biopsies based on number and location (central tumor vs. peripheral tumor) of biopsies. Our analysis found that the correlation between marker-specific cell subsets in larger tumor versus smaller core was stronger with two core biopsies and was not further strengthened with additional biopsies. Moreover, this correlation was consistently strong regardless of whether the biopsy was taken at the center or at the periphery of the original tumor sample. These findings could have a substantial impact on longitudinal assessment for detection of biomarkers in clinical trials.